An Efficient and Robust Method for Chest X-ray Rib Suppression That Improves Pulmonary Abnormality Diagnosis

Background: Suppression of thoracic bone shadows on chest X-rays (CXRs) can improve the diagnosis of pulmonary disease. Previous approaches can be categorized as either unsupervised physical models or supervised deep learning models. Physical models can remove the entire ribcage and preserve the morphological lung details but are impractical due to their extremely long processing time. Machine learning (ML) methods are computationally efficient but are limited by the ground truth (GT) available for effective and robust training, resulting in suboptimal results. Purpose: To improve bone shadow suppression, we propose a generalizable yet efficient workflow for CXR rib suppression by combining physical and ML methods. Materials and Methods: Our pipeline consists of two stages: (1) pair generation with GT bone shadows eliminated by a physical model in spatially transformed gradient fields; and (2) a fully supervised image denoising network trained on stage-one datasets for fast rib removal from incoming CXRs. For stage two, we designed a densely connected network called SADXNet, trained with a loss function that combines the peak signal-to-noise ratio and the multi-scale structural similarity index measure to suppress the bony structures. SADXNet organizes the spatial filters in a U shape and preserves the feature map dimension throughout the network flow. Results: Visually, SADXNet can suppress the rib edges near the lung wall/vertebra without compromising the vessel/abnormality conspicuity. Quantitatively, it achieves an RMSE of ~0 compared with the GTs generated by the physical model, with one prediction taking <1 s during testing. Downstream tasks, including lung nodule detection as well as common lung disease classification and localization, are used to provide task-specific evaluations of our rib suppression mechanism. We observed 3.23% and 6.62% AUC increases, as well as 203 (1273 to 1070) and 385 (3029 to 2644) absolute false positive decreases, for lung nodule detection and common lung disease localization, respectively. Conclusion: Through learning from image pairs generated from the physical model, the proposed SADXNet can make a robust sub-second prediction without losing fidelity. Quantitative outcomes from downstream validation further underpin the superiority of SADXNet and of training ML-based rib suppression approaches on the dataset yielded by the physical model. The training images and SADXNet are provided in the manuscript.


Introduction
Respiratory diseases are among the major causes of morbidity and mortality globally, and the prevalence of pulmonary diseases has steadily increased [1,2]. Timely diagnosis is critical for effective intervention. Among all imaging tools, a chest X-ray (CXR) is the most widely used for pre-screening thoracic anomalies [3]. Compared with CT, the downside of a CXR is the overlapping anatomies in 2D projections. The high-contrast bony structures are one of the major interferences in CXRs, obscuring the underlying soft tissues. Therefore,

Method
In this section, we introduce the ST-smoothing method, the FX-RRCXR dataset, SADXNet, and the downstream validation on the NODE21 and ChestX-ray14 datasets.

ST Smoothing
The ST-smoothing algorithm assumes that the pixel intensities along one continuous contour of a rib are theoretically identical. As shown in Figure 1 S0, if two points $p_1$ and $p_2$ are at equal distances from the centerline, they are deemed to be on the same contour and their corresponding rib intensities are equal.

ST transformation $T_C$ is a domain transformation used to generate a specific representation of the part of an image defined by a given closed cyclic contour $C: \gamma(t)$, $t \in [0, C_{len})$, $\gamma(0) = \gamma(C_{len})$. $T_C$ is defined by its inverse in Equation (1), which depends on the contour norm at $\gamma(t)$. Figure 2a illustrates the transformation of contour $C$ [7].
To reduce the computational burden, a continuous rib contour is treated as a piecewise linear contour (Figure 1 S0) for the discrete implementation of Equation (1). Figure 2b illustrates the implementation and the discrete formulation of $s$ and $t$, where $t_{prev}$ is the sum of the lengths of all previous edges. The pixel intensities in the ST space follow from this mapping. Assuming that a valid $(s, t)$ always has a position $(\hat{s}, t)$ on the other side of the bone centerline $c(t)$, $c(t)$ can be obtained from these paired positions.
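As a rough illustration of the piecewise linear discretization, the sketch below maps an image point to $(s, t)$ coordinates by projecting it onto the nearest contour edge. The function name `st_coordinates`, the open (non-cyclic) polyline, the nearest-edge rule, and the sign convention for $s$ are all assumptions made for illustration; they are not taken from the paper's equations.

```python
import math

def st_coordinates(point, contour):
    """Map an (x, y) point to (s, t) relative to an open polyline `contour`.

    Assumed convention: t = t_prev + distance travelled along the nearest edge,
    s = signed perpendicular distance (positive on the left of the edge).
    """
    px, py = point
    best = None
    t_prev = 0.0  # sum of the lengths of all previous edges
    for (ax, ay), (bx, by) in zip(contour, contour[1:]):
        ex, ey = bx - ax, by - ay
        edge_len = math.hypot(ex, ey)
        if edge_len == 0:
            continue  # skip degenerate edges
        # Fraction along the edge of the orthogonal projection, clamped to the edge.
        u = ((px - ax) * ex + (py - ay) * ey) / (edge_len ** 2)
        u = min(1.0, max(0.0, u))
        qx, qy = ax + u * ex, ay + u * ey  # closest point on this edge
        dist = math.hypot(px - qx, py - qy)
        # Sign from the 2D cross product of the edge direction and the point offset.
        cross = ex * (py - ay) - ey * (px - ax)
        s = math.copysign(dist, cross)
        if best is None or dist < best[0]:
            best = (dist, s, t_prev + u * edge_len)
        t_prev += edge_len
    return best[1], best[2]
```

For the contour `[(0, 0), (4, 0), (4, 3)]`, the point `(2, 1)` projects onto the first edge and the point `(5, 2)` onto the second, so the second point's $t$ includes the full length of the first edge as $t_{prev}$.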

Rib Extraction via Partial Derivatives Smoothing in ST Space
The current subsection explains steps S1(a)→S2 in Figure 1.

Discrete Partial Derivative in ST Space
As shown in Equation (9), the first-order partial derivative is calculated along the $s$ axis in a discrete form to speed up the overall computation. $I^d_C(s, t)$ represents the gradient orthogonal to the $t$ axis of a rib, which means that any structure oriented along the $t$ axis does not contribute to the bone gradients.

Diagnostics 2023, 13, 1652

Smoothing, Reintegration, and Transformation Back to the XY Domain
Extending von Berg et al. [7], we implement Gaussian smoothing ($G_t$) along the $t$ axis of $I^d_C$ and centerline smoothing ($C$) along the $s$ axis of the reintegrated $I^r_C$. First, since we hypothesize that the signals along the $t$ axis of $I^d_C$ are independent from the ribs, a large Gaussian kernel was used to smooth out the signals from the soft tissue and leave only the signals from the ribs in $G_t(I^d_C)$. The kernel size is a hyperparameter (HP). Next, after excluding the soft tissue signal via $G_t(I^d_C)$, we reintegrate the smoothed partial gradients to recover the bone signals $I^r_C(s, t)$ in the ST domain, as shown in Equation (10).
Lastly, because $T_C$ has a singularity at the centerline $c(t)$, an artificial edge was observed along $c(t)$ after the reintegration of $I^r_C$, as shown in Figure 1 S2. Therefore, a K-nearest neighbor (KNN) based centerline smoothing is applied along the $s$ axis of $I^r_C$ to smooth out the artificial edge, according to Equation (11), where $\tau$ and $k$ are two HPs that represent the threshold for conducting the KNN average and the number of neighbors used, respectively. After the above steps, we transfer the rib intensity $C(I^r_C)$ from the ST domain back into the image space using Equation (12), which excludes possible negative values because they are uninterpretable in the image space.
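The derivative, smoothing, and reintegration steps can be sketched on a small ST-space array as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the image is a nested list indexed as `image_st[s][t]`, the derivative is a forward difference, the reintegration is a cumulative sum that starts from zero at the first row, and the names `sigma_t` and `radius` are hypothetical. The KNN centerline smoothing of Equation (11) is left out.

```python
import math

def gaussian_kernel(sigma, radius):
    """Normalized 1D Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_along_t(row, kernel):
    """Smooth one row (fixed s, varying t), replicating the edge values."""
    radius = len(kernel) // 2
    n = len(row)
    out = []
    for j in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = min(n - 1, max(0, j + k - radius))
            acc += w * row[idx]
        out.append(acc)
    return out

def extract_rib_signal(image_st, sigma_t, radius):
    """Return a reintegrated bone estimate from ST-space intensities."""
    n_s, n_t = len(image_st), len(image_st[0])
    kernel = gaussian_kernel(sigma_t, radius)
    # Forward difference along s: the gradient orthogonal to the rib axis.
    grad = [[image_st[i + 1][j] - image_st[i][j] for j in range(n_t)]
            for i in range(n_s - 1)]
    # Gaussian smoothing along t suppresses detail that varies along the rib.
    smoothed = [smooth_along_t(row, kernel) for row in grad]
    # Reintegrate along s by cumulative summation, starting from zero.
    bone = [[0.0] * n_t]
    for row in smoothed:
        bone.append([b + g for b, g in zip(bone[-1], row)])
    # Clamp negative values, which have no meaning as image intensities.
    return [[max(0.0, v) for v in row] for row in bone]
```

When the input is constant along $t$, as the ST-smoothing assumption states for a pure rib signal, the smoothing leaves the gradients unchanged and the reintegration recovers the rib profile up to its starting offset.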

Rib Removal and Border Blending
This subsection covers Figure 1 S3. The initial rib-suppressed CXR $I^{soft}_C$ is acquired by subtracting $I^{bone}_C$ from the raw CXR (Equation (13)). To improve the continuity between the rib boundary $r_b$ and its surrounding soft tissues, a KNN border smoothing function is applied to $r_b$ (Equation (14)), where $r_b$ is defined in the ST space by Equation (15). $s_b$ in Equation (15) and $k$ in Equation (14) are two HPs.
Lastly, generating a complete soft tissue CXR requires iteratively repeating Sections 2.1.1-2.1.3 for each rib in a complete ribcage.
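The per-rib iteration can be pictured as repeated subtraction of one bone layer at a time. The sketch below assumes each rib's bone estimate has already been mapped back to image space as a nested list of the same size as the CXR; the function name `suppress_ribs` is hypothetical, and the KNN border blending step is omitted.

```python
def suppress_ribs(image, bone_layers):
    """Subtract each rib's bone estimate from a copy of the raw image."""
    soft = [row[:] for row in image]  # leave the input untouched
    for bone in bone_layers:          # one layer per rib
        for y, row in enumerate(bone):
            for x, value in enumerate(row):
                soft[y][x] -= value
    return soft
```

Subtracting onto a copy keeps the raw CXR available as the network input when the image pairs are assembled.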

VinDr-RibCXR Dataset
VinDr-RibCXR is selected for creating the FX-RRCXR dataset using our modified ST-smoothing algorithm. VinDr-RibCXR is a dataset for automatic rib segmentation and labeling of SECXR scans. It contains 245 images with corresponding rib masks annotated by an expert. Each CXR scan has 20 separate rib annotations in the left and right lungs, respectively. The dataset was pre-split into training and validation sets, with 196 scans in the training set and 49 in the validation set. We refer the readers to Nguyen et al. [19] for more details.

FX-RRCXR Dataset Preparation
We applied the ST-smoothing algorithm to the 245 images from the VinDr-RibCXR dataset. The HPs required by ST smoothing were tuned for each image individually using the random grid search algorithm. Excluding the HP search, ST smoothing took 40-70 min to produce one rib-removed scan, depending on the number of pixels within the ribcage. When preparing the image pairs, we used the original CXRs as input and their corresponding rib-suppressed scans as GT, and we kept the same training and validation split as VinDr-RibCXR.
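A per-image random search over a grid of HP values might look like the sketch below. The grid values, the HP names (`sigma_t`, `tau`, `k`, `s_b`), and the placeholder objective `toy_score` are all hypothetical; the paper does not state which values were searched or what criterion was optimized.

```python
import random

def random_grid_search(grid, score_fn, n_trials, seed=0):
    """Sample n_trials random combinations from `grid`; keep the lowest score."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_trials):
        params = {name: rng.choice(values) for name, values in grid.items()}
        score = score_fn(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical grid; the actual ranges are not given in the paper.
grid = {
    "sigma_t": [15, 25, 35, 45],
    "tau": [0.05, 0.1, 0.2],
    "k": [3, 5, 7],
    "s_b": [2, 4, 6],
}

def toy_score(p):
    # Placeholder objective (lower is better), used only to make the sketch runnable.
    return abs(p["sigma_t"] - 35) + abs(p["k"] - 5) + abs(p["s_b"] - 4) + p["tau"]

best_params, best_score = random_grid_search(grid, toy_score, n_trials=50)
```

Because each trial requires a full ST-smoothing run, the number of trials per image directly multiplies the 40-70 min processing time reported above.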

SADXNet
Since rib signals can be treated as noise superimposed on the rib-suppressed CXRs, we designed an image-denoising network, named SADXNet, to be trained on the FX-RRCXR dataset. Inspired by the architecture of DenseNet [23], SADXNet is designed to be densely connected, as shown in Figure 3. For each layer in SADXNet, the feature maps of all the preceding layers and its own feature map are fed into all the subsequent layers. Dense connections alleviate the vanishing-gradient problem, strengthen feature propagation, and encourage feature reuse [23]. The composition of SADXNet is detailed below.
Dense connectivity: Figure 3 shows the layout of the densely connected SADXNet schematically. Each layer can be described by Equation (16), where $[x_0, x_1, \ldots, x_{l-1}]$ represents the channel-wise concatenation of the feature maps produced from layers 0 to $l-1$, and $f^c_l(\cdot)$ is a 1 × 1 convolution (Conv) that unifies the number of feature channels of $x_0, \ldots, x_{l-2}$. The output channel count of $f^c_l$ is defined in terms of $C$, the number of feature channels of $x_{l-1}$. The purpose of $f^c_l(\cdot)$ is to prevent the concatenated input to $N_l(\cdot)$ from growing so large in the channel dimension that it exceeds the GPU memory.
Composite function: We define $N_l(\cdot)$ as a composite function of three consecutive operations: batch normalization (BN), rectified linear unit (ReLU), and 3 × 3 convolution (Conv).
Pooling layers: Inspired by a pioneering study [8], we keep the height (H) and width (W) of the feature maps equal to those of the input images throughout the Conv process, without using down- or up-sampling layers.
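Keeping H and W fixed without resampling layers constrains the convolution settings. The standard output-size formula for a convolution without dilation makes this concrete; the one-pixel padding for the 3 × 3 Conv and the 512-pixel side length are assumptions for illustration, since the paper does not state them.

```python
def conv_output_size(size, kernel, padding, stride=1):
    """Spatial output size of a convolution along one axis (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1

# A 3x3 Conv with stride 1 preserves the size only with one pixel of padding.
same_3x3 = conv_output_size(512, kernel=3, padding=1)
# A 1x1 Conv preserves the size without any padding.
same_1x1 = conv_output_size(512, kernel=1, padding=0)
# Without padding, each 3x3 Conv would shrink the feature map by two pixels.
shrunk_3x3 = conv_output_size(512, kernel=3, padding=0)
```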
Channel design: Overall, there are seven densely connected layers in the SADXNet. The channel of the convolutional kernel for each layer is designed in an increase-to-decrease setting to mimic the design of a fully convolutional network [24] in regard to channel dimension, so as to strike a balance between the model complexity (kernel with more channels) and training time (kernel with fewer channels).
Loss function: The cost function of SADXNet is designed as a combination of the negative peak signal-to-noise ratio, the multi-scale structural similarity index measure [25], and the $L_1$ deviation measurement (Equations (17)-(19)). Here, $\alpha$ and $\beta$ are the parameters of Equation (17); the two variables in Equation (19) stabilize the division with a weak denominator, with $S$ as the dynamic range of the pixel values (typically $2^{\#\text{bits per pixel}} - 1$) and $(k_1, k_2)$ as constants; and $\|\cdot\|_1$ denotes the $\ell_1$ norm.
Model training: SADXNet was implemented in PyTorch, and the training was performed on a GPU cluster with 4 × RTX A6000. For SADXNet training, we set the maximum epoch number to 200 and observed that the model converged at around 100 epochs. The Adam optimizer was applied with an initial learning rate (LR) of 0.001 and a batch size of 1 × 4.
Evaluation metrics: The root mean square error (RMSE) is used for evaluating the SADXNet performance with the corresponding GT generated from ST Smoothing.
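The PSNR and $L_1$ terms of the loss and the RMSE metric all reduce to simple pixel-wise statistics. The sketch below uses their standard definitions on flattened pixel lists; it leaves out the multi-scale structural similarity term and the way the terms are weighted, and the `eps` guard and function names are assumptions rather than details from the paper.

```python
import math

def mse(pred, target):
    """Mean squared error between two equally long pixel lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def rmse(pred, target):
    """Root mean square error, the evaluation metric named above."""
    return math.sqrt(mse(pred, target))

def psnr(pred, target, dynamic_range=1.0, eps=1e-12):
    """PSNR in dB; eps avoids division by zero for identical images."""
    return 10.0 * math.log10(dynamic_range ** 2 / (mse(pred, target) + eps))

def l1_deviation(pred, target):
    """Mean absolute deviation between prediction and target."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)
```

Since a higher PSNR means a closer match, the loss uses its negative so that minimizing the loss maximizes the PSNR.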

Downstream Clinical Task Validation
We quantified the benefits of rib-suppressed images with two experiments: lung nodule detection on the NODE21 [20] dataset and a general pulmonary disease classification and localization task on the ChestX-ray14 [21] dataset. The details are set out below.
Datasets: NODE21 [20] encompasses 4882 CXR scans with a patient-to-volunteer ratio of 1134:3748. Moreover, 5524 annotations were made to these images, with a maximum of three positive annotations for each scan. ChestX-ray14 [21] is a CXR set that has been text mined for fourteen lung diseases and has bounding box annotations for 984 images in the pre-split test set. We extracted the 984 annotated images and then randomly sampled 3 × 984 healthy-volunteer images from the test set to construct a dataset for supervised detection training. For both datasets, we split the training and validation sets in a ratio of 7:3 and carefully balanced the proportion of positive and negative cases.
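One way to obtain a 7:3 split that keeps the proportion of positive and negative cases the same in both parts is to split each class separately. The sketch below applies this to the ChestX-ray14 subset sizes given above (984 annotated and 3 × 984 healthy images); the function name, the seed, and the integer rounding are assumptions, and the authors' exact balancing procedure is not described.

```python
import random

def stratified_split(labels, train_parts=7, total_parts=10, seed=0):
    """Split indices per class so both parts keep the class proportions."""
    rng = random.Random(seed)
    train, val = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = len(idx) * train_parts // total_parts  # integer 7:3 cut
        train += idx[:cut]
        val += idx[cut:]
    return sorted(train), sorted(val)

# 984 annotated (positive) images plus 3 x 984 healthy (negative) images.
labels = [1] * 984 + [0] * (3 * 984)
train_idx, val_idx = stratified_split(labels)
```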
Detection network: Both tasks were performed in the Mask R-CNN pipeline and implemented in PyTorch [22], training the network separately on three different types of input with data augmentation of random scaling, random cropping, and random Gaussian blur, an LR of 0.001, a stochastic gradient descent optimizer, and a batch size of 2 × 4 on the 4 × RTX A6000 GPU cluster. Altogether, we ran 30,000 and 50,000 training iterations for NODE21 [20] and ChestX-ray14 [21], respectively.
Evaluation metrics: The area under the curve (AUC), the true positive predictions (TP), the false positive predictions (FP), and the false negative predictions (FN) are used as evaluation metrics.
Figure 4 demonstrates a patient with lung nodules in the left lung and compares the nodule visibility with and without rib removal using ST smoothing. Compared to previous rib-suppression methods, which leave edge residuals or artifacts near the lung borders [4,9], ST smoothing avoids both drawbacks. Additionally, ST smoothing preserved the shape and morphological details of the lung tumors.
Figure 4. The left-hand side shows the rib-removed scan, and the right-hand side shows the original unprocessed CXR image. Red arrows denote radiologically confirmed nodules.
Figure 5 shows two subjects predicted by SADXNet. We found that the rib-suppressed scans predicted by SADXNet are visually indistinguishable from their corresponding GTs. Quantitatively, SADXNet achieves a test RMSE of $2.32 \pm 0.13 \times 10^{-5}$. Most notably, compared to the time-consuming ST-smoothing algorithm, SADXNet suppresses one scan in <1 s.

NODE21 Detection
According to Table 1, mixing raw and rib-suppressed images in training achieved better detection scores than training on a single-source input. We also evaluated the mix-trained model on the raw and rib-suppressed validation sets separately. Both scenarios achieved comparable performance, with slightly better outcomes for the raw CXRs. Quantitatively, training on mixed images achieved an approximately 2-3% higher AUC, located more nodules (Figure 6 first row), and significantly reduced the FP (Figure 6 second row) compared with single-source images. Lastly, the networks trained on a single image type perform similarly, with a slightly lower FP when using rib-suppressed images.

ChestX-ray14 Classification and Localization
As shown in Table 2, training with mixed scans achieved the best performance across the three input combinations, resulting in around a 6-7% higher AUC than single-source trained detectors and largely reducing the FP predictions. Single-source trained models roughly reach similar performance, except that a model trained with rib-suppressed images makes fewer FP classifications. Figure 6 visually confirms the quantitative results.

Figure 6. Two sample cases from the NODE21 test set in the first two rows, and two sample cases from the ChestX-ray14 test set in the third and fourth rows. The figure is organized into three columns: the first column shows predictions made using the model trained with raw CXRs only, the second column shows predictions made using the mix-trained model, and the last column shows the GTs. Red boxes denote manually labeled bounding boxes for the disease location. Green boxes are results of automated detection.
Moreover, since ChestX-ray14 targets multiple lung disease localization, we also present the AUC for each disease in Table 3. Despite the varying disease-specific performance, the improvements from mixed training are consistent with Table 2.
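For readers less familiar with the AUC values reported in the tables, the sketch below computes the AUC with its standard rank-based definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. Whether the authors used this exact procedure is not stated, and the scores in the example are made up.

```python
def auc(scores, labels):
    """AUC as the fraction of positive-negative pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Each correctly ordered pair counts 1, each tie counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up scores: one positive (0.3) ranks below one negative (0.8).
example_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

In the example, three of the four positive-negative pairs are ordered correctly, which gives an AUC of 0.75.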

Discussion
Chest X-rays (CXRs) are highly accessible and cost effective for point-of-care pulmonary disease screening. However, the interpretation of a CXR can be difficult because structures overlap with each other in the 2D projection. Specifically, tissue details can be obscured by the ribs, which appear bright in a CXR due to their higher atomic number and density. We therefore combined the strengths of physical and ML methods to overcome their respective weaknesses. We first used ST smoothing and the VinDr-RibCXR [19] dataset to create a benchmark dataset, FX-RRCXR, with 245 paired original and rib-suppressed CXRs. We then trained a supervised denoising network, SADXNet, achieving high-quality sub-second rib suppression. Lastly, we evaluated the quality of the FX-RRCXR dataset and SADXNet using two downstream tasks: lung nodule detection using the NODE21 [20] dataset and classification and localization of fourteen lung diseases using the ChestX-ray14 [21] dataset.
Our contribution is thus twofold. First, we used a physical model to generate a high-quality dataset that supports deep learning, which is released to the public as part of the paper. Second, we trained a fully automated supervised deep network that achieved physical-model quality $1 \times 10^4$ times faster. The enormous gain in efficiency benefits large-scale testing of rib-suppressed CXRs for various clinical applications, two of which were exemplified here: lung nodule detection and benign disease classification. In both end-to-end tests, our quantitative testing showed significant improvement in diagnosis with rib suppression. To the best of our knowledge, this is the first study to demonstrate the benefit of combining a physical method with ML in end-to-end diagnostic testing.
The current study also demonstrated the pipeline's robustness. Generally, distributional shift poses a major challenge in clinical deployment [26]. Here, we found that although SADXNet was trained on FX-RRCXR, it still robustly suppressed the rib structures on scans from NODE21 and ChestX-ray14 (see Figure 6 middle column), yielding improved stability in clinically relevant tasks.
An interesting observation is that training a detection model on rib-suppressed CXRs alone does not improve model performance compared to training on the original scans. In contrast, mixed training on the two image sources can significantly reduce type I error while moderately increasing sensitivity. We attribute this to three considerations: (1) mixed training achieved better performance because comparing the region of interest (ROI) with and without the ribs helped the learning model; (2) type I error decreased significantly because the rib-removed scans make the model less likely to misidentify noise in the rib structures, such as edges, as an ROI; and (3) type II error decreased only moderately because the ribs account for only part of the superimposing anatomy in CXRs. Rib suppression does not reduce interference from other anatomical and non-anatomical structures, including the heart, major vessels, mediastinum, and attached sensors. However, the current study may provide a roadmap for reducing additional interference in CXR-based diagnostic tasks.

Conclusions
We improved ST smoothing from von Berg et al. [7] and, based on that, introduced a paired dataset, FX-RRCXR, that serves as a benchmark for supervised DL on CXR rib removal. Next, we proposed a denoising network, SADXNet, which learnt rib suppression by treating the ribs as noise on the lung tissues. Lastly, we validated the efficacy of rib suppression using two downstream tasks: lung nodule detection and common lung anomaly classification and localization. The experimental results from the downstream tasks quantitatively substantiated the benefit of the FX-RRCXR dataset and SADXNet.

Funding: This research was funded by NIH grant number R01CA259008 and DOD grant number W81XWH2210044.

Institutional Review Board Statement:
The study is based on publicly available data and thus does not require IRB approval.

Informed Consent Statement:
Not applicable for studies based on anonymized public data.